123 research outputs found

    Flow cytometric determination of genome size in European sunbleak Leucaspius delineatus (Heckel, 1843)

    The aim of this study was to compare DNA content in hepatocyte and erythrocyte nuclei of the European sunbleak, Leucaspius delineatus, in relation to nuclear and cell size by means of flow cytometry and fluorescence microscopy. The DNA standards, chicken and rainbow trout erythrocytes, were prepared in parallel with both cell types, with initial separation of liver cells in pepsin solution followed by cell filtering. Standards and investigated cells were stained with a mixture of propidium iodide, citric acid, and Nonidet P40 in the presence of RNase, and fluorescence of at least 50,000 nuclei was analyzed by flow cytometry. Average cell size was determined by flow cytometry, using fresh cell suspension in relation to latex beads of known diameter. The size of nuclei was examined on the basis of digital micrographs obtained by fluorescence microscopy after nuclei staining with DAPI. The sunbleak’s erythrocyte nuclei contain 2.25 ± 0.06 pg of DNA, whereas the hepatocyte nuclei contain 2.46 ± 0.06 pg of DNA. This difference in DNA content was determined spectroscopically using isolated DNA from the two cell types. The modal diameters of the erythrocytes and hepatocytes were estimated to be 5.1 ± 0.2 and 22.3 ± 5.0 μm, respectively, and the corresponding modal dimensions of their nuclei (measured as surface area) were 15.2 and 21.4 μm², respectively. The nucleoplasmic index, as calculated from diameters estimated from surface area of nuclear profiles, was 2.51 for the erythrocytes compared with 0.08 for hepatocytes

    Mutational Biases and Selective Forces Shaping the Structure of Arabidopsis Genes

    Recently, features of gene expression profiles have been associated with structural parameters of gene sequences in organisms representing a diverse set of taxa. The emerging picture indicates that natural selection, mediated by gene expression profiles, has a significant role in determining genic structures. However, the current situation is less clear in plants, as the available data indicate that the effect of natural selection mediated by gene expression is very weak. Moreover, the direction of the patterns in plants appears to contradict those observed in animal genomes. In the present work we analyzed expression data for >18000 Arabidopsis genes retrieved from public datasets obtained with different technologies (MPSS and high density chip arrays) and compared them with gene parameters. Our results show that the impact of natural selection mediated by expression on gene sequences is significant and distinguishable from the effects of regional mutational biases. In addition, we provide evidence that the level and the breadth of gene expression are related in opposite ways to many structural parameters of gene sequences. Higher levels of expression abundance are associated with smaller transcripts, consistent with the need to reduce costs of both transcription and translation. Expression breadth, however, shows a contrasting pattern, i.e. longer genes have higher breadth of expression, possibly to ensure those structural features associated with gene plasticity. Based on these results, we propose that the specific balance between these two selective forces plays a significant role in shaping the structure of Arabidopsis genes

    PROMPT: a protein mapping and comparison tool

    BACKGROUND: Comparison of large protein datasets has become a standard task in bioinformatics. Typically researchers wish to know whether one group of proteins is significantly enriched in certain annotation attributes or sequence properties compared to another group, and whether this enrichment is statistically significant. In order to conduct such comparisons it is often required to integrate molecular sequence data and experimental information from disparate incompatible sources. While many specialized programs exist for comparisons of this kind in individual problem domains, such as expression data analysis, no generic software solution capable of addressing a wide spectrum of routine tasks in comparative proteomics is currently available. RESULTS: PROMPT is a comprehensive bioinformatics software environment which enables the user to compare arbitrary protein sequence sets, revealing statistically significant differences in their annotation features. It allows automatic retrieval and integration of data from a multitude of molecular biological databases as well as from a custom XML format. Similarity-based mapping of sequence IDs makes it possible to link experimental information obtained from different sources despite discrepancies in gene identifiers and minor sequence variation. PROMPT provides a full set of statistical procedures to address the following four use cases: i) comparison of the frequencies of categorical annotations between two sets, ii) enrichment of nominal features in one set with respect to another one, iii) comparison of numeric distributions, and iv) correlation of numeric variables. Analysis results can be visualized in the form of plots and spreadsheets and exported in various formats, including Microsoft Excel. CONCLUSION: PROMPT is a versatile, platform-independent, easily expandable, stand-alone application designed to be a practical workhorse in analysing and mining protein sequences and associated annotation. 
The availability of the Java Application Programming Interface and scripting capabilities on one hand, and the intuitive Graphical User Interface with context-sensitive help system on the other, make it equally accessible to professional bioinformaticians and biologically-oriented users. PROMPT is freely available for academic users from

    Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

    There is an enormous amount of information encoded in each genome – enough to create living, responsive and adaptive organisms. Raw sequence data alone is not enough to understand function, mechanisms or interactions. Changes in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features are varied in their data types, and annotation of these features is spread across multiple databases. Herein, we develop a method to automate exploration of genomes by iteratively exploring sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and was successful in detecting coinciding genomic features among genes, exons, repetitive elements and CpG islands

    Intergenic and Genic Sequence Lengths Have Opposite Relationships with Respect to Gene Expression

    Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model, in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution

    Gene Expression in Chicken Reveals Correlation with Structural Genomic Features and Conserved Patterns of Transcription in the Terrestrial Vertebrates

    Background - The chicken is an important agricultural and avian-model species. A survey of gene expression in a range of different tissues will provide a benchmark for understanding expression levels under normal physiological conditions in birds. With expression data for birds being very scant, this benchmark is of particular interest for comparative expression analysis among various terrestrial vertebrates. Methodology/Principal Findings - We carried out a gene expression survey in eight major chicken tissues using whole genome microarrays. A global picture of gene expression is presented for the eight tissues, and tissue-specific as well as common gene expression were identified. A Gene Ontology (GO) term enrichment analysis showed that tissue-specific genes are enriched with GO terms reflecting the physiological functions of the specific tissue, and housekeeping genes are enriched with GO terms related to essential biological functions. Comparisons of structural genomic features between tissue-specific genes and housekeeping genes show that housekeeping genes are more compact. Specifically, their coding sequences and particularly their introns are shorter than those of genes that display more variation in expression between tissues, and their intergenic space is also shorter. Meanwhile, housekeeping genes are more likely to co-localize with other abundantly or highly expressed genes on the same chromosomal regions. Furthermore, comparisons of gene expression in a panel of five common tissues between birds, mammals and amphibians showed that the expression patterns across tissues are highly similar for orthologous genes compared to random gene pairs within each pair-wise comparison, indicating a high degree of functional conservation in gene expression among terrestrial vertebrates. 
Conclusions - The housekeeping genes identified in this study have shorter gene length, shorter coding sequence length, shorter introns, and shorter intergenic regions; there seems to be selection pressure on economy in genes with a wide tissue distribution, i.e. these genes are more compact. A comparative analysis showed that the expression patterns of orthologous genes are conserved in the terrestrial vertebrates during evolution

    EST Analysis of Ostreococcus lucimarinus, the Most Compact Eukaryotic Genome, Shows an Excess of Introns in Highly Expressed Genes

    Background: The genome of the pico-eukaryotic (bacterial-sized) prasinophyte green alga Ostreococcus lucimarinus has one of the highest gene densities known in eukaryotes, yet it contains many introns. Phylogenetic studies suggest this unusually compact genome (13.2 Mb) is an evolutionarily derived state among prasinophytes. The presence of introns in the highly reduced O. lucimarinus genome appears to be in opposition to simple explanations of genome evolution based on unidirectional tendencies, either neutral or selective. Therefore, patterns of intron retention in this species can potentially provide insights into the forces governing intron evolution. Methodology/Principal Findings: Here we studied intron features and levels of expression in O. lucimarinus using expressed sequence tags (ESTs) to annotate the current genome assembly. ESTs were assembled into unigene clusters that were mapped back to the O. lucimarinus Build 2.0 assembly using BLAST, and the level of gene expression was inferred from the number of ESTs in each cluster. We find a positive correlation between expression levels and both intron number (R = +0.0893, p < 0.0005) and intron density (number of introns/kb of CDS; R = +0.0753, p < 0.005). Conclusions/Significance: In a species with a genome that has been recently subjected to a great reduction of non-coding DNA, these results imply the existence of selective/functional roles for introns that are principally detectable in highly expressed genes. In these cases, introns are likely maintained by balancing the selective forces favoring their maintenance

    Regular Patterns for Proteome-Wide Distribution of Protein Abundance across Species

    A proteome of the bio-entity, including cell, tissue, organ, and organism, consists of proteins of diverse abundance. The principle that determines the abundance of different proteins in a proteome is of fundamental significance for an understanding of the building blocks of the bio-entity. Here, we report three regular patterns in the proteome-wide distribution of protein abundance across species such as human, mouse, fly, worm, yeast, and bacteria: in most cases, protein abundance is positively correlated with the protein's origination time or sequence conservation during evolution; it is negatively correlated with the protein's domain number and positively correlated with domain coverage in protein structure, and the correlations became stronger during the course of evolution; protein abundance can be further stratified by the function of the protein, whereby proteins that act on material conversion and transportation (mass category) are more abundant than those that act on information modulation (information category). Thus, protein abundance is intrinsically related to the protein's inherent characters of evolution, structure, and function

    Ribosomal DNA Deletions Modulate Genome-Wide Gene Expression: “rDNA–Sensitive” Genes and Natural Variation

    The ribosomal rDNA gene array is an epigenetically-regulated repeated gene locus. While rDNA copy number varies widely between and within species, the functional consequences of subtle copy number polymorphisms have been largely unknown. Deletions in the Drosophila Y-linked rDNA modify heterochromatin-induced position effect variegation (PEV), but it has been unknown if the euchromatic component of the genome is affected by rDNA copy number. Polymorphisms of naturally occurring Y chromosomes affect both euchromatin and heterochromatin, although the elements responsible for these effects are unknown. Here we show that copy number of the Y-linked rDNA array is a source of genome-wide variation in gene expression. Induced deletions in the rDNA affect the expression of hundreds to thousands of euchromatic genes throughout the genome of males and females. Although the affected genes are not physically clustered, we observed functional enrichments for genes whose protein products are located in the mitochondria and are involved in electron transport. The affected genes significantly overlap with genes affected by natural polymorphisms on Y chromosomes, suggesting that polymorphic rDNA copy number is an important determinant of gene expression diversity in natural populations. Altogether, our results indicate that subtle changes to rDNA copy number between individuals may contribute to biologically relevant phenotypic variation